AITopics | file name

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

How to Go Paperless in 9 Steps

WIRED Nov-20-2025, 12:00:00 GMT

Has Your Pledge to Go Paperless Perished? You promised yourself you'd digitize every last receipt, document, and paper record. But the trick to getting rid of paper is to not worry about being perfect. Wanting to get rid of paper in your life is easy. Following through with that promise to yourself is hard.

artificial intelligence, inbox, promo code, (16 more...)

WIRED

Country: North America > United States (0.30)

Industry: Information Technology > Security & Privacy (0.31)

Technology:

Information Technology > Communications (0.71)
Information Technology > Artificial Intelligence (0.48)

Repository-Aware File Path Retrieval via Fine-Tuned LLMs

Yanuganti, Vasudha, Puri, Ishaan, Chhatre, Swapnil, Singh, Mantinder, Jallepalli, Ashok, Shrivastava, Hritvik, Sharma, Pradeep Kumar

arXiv.org Artificial Intelligence Oct-13-2025

Modern codebases make it hard for developers and AI coding assistants to find the right source files when answering questions like "How does this feature work?" or "Where was the bug introduced?" Traditional code search (keyword- or IR-based) often misses semantic context and cross-file links, while large language models (LLMs) understand natural language but lack repository-specific detail. We present a method for file path retrieval that fine-tunes a strong LLM (Qwen3-8B) with QLoRA and Unsloth optimizations to predict relevant file paths directly from a natural-language query. To build training data, we introduce six code-aware strategies that use abstract syntax tree (AST) structure and repository content to generate realistic question-answer pairs, where answers are sets of file paths. The strategies range from single-file prompts to hierarchical repository summaries, providing broad coverage. We fine-tune on Python projects including Flask, Click, Jinja, FastAPI, and PyTorch, and obtain high retrieval accuracy: up to 91% exact match and 93% recall on held-out queries, clearly beating single-strategy training. On a large codebase like PyTorch (about 4,000 Python files), the model reaches 59% recall, showing scalability. We analyze how multi-level code signals help the LLM reason over cross-file context and discuss dataset design, limits (for example, context length in very large repos), and future integration of retrieval with LLM-based code intelligence.
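
The abstract reports exact match and recall over sets of file paths but does not spell out how they are computed. A minimal sketch of one plausible reading follows; the function names, the example paths, and the treatment of an empty gold set are assumptions for illustration, not the authors' evaluation code.

```python
def exact_match(predicted, gold):
    """True when the predicted set of paths equals the gold set exactly."""
    return set(predicted) == set(gold)


def recall(predicted, gold):
    """Fraction of gold paths that appear among the predicted paths."""
    gold_set = set(gold)
    if not gold_set:
        return 1.0  # assumption: nothing to retrieve counts as full recall
    return len(gold_set & set(predicted)) / len(gold_set)


def evaluate(examples):
    """Average both metrics over (predicted, gold) pairs."""
    n = len(examples)
    return {
        "exact_match": sum(exact_match(p, g) for p, g in examples) / n,
        "recall": sum(recall(p, g) for p, g in examples) / n,
    }


# Hypothetical queries: the first is answered perfectly, the second misses a file.
examples = [
    (["src/app.py", "src/routing.py"], ["src/app.py", "src/routing.py"]),
    (["src/cli.py"], ["src/cli.py", "src/utils.py"]),
]
scores = evaluate(examples)
print(scores)
```

Under this reading, exact match is the stricter metric, since a single missing or extra path fails the whole query, while recall gives partial credit.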

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.0885

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

code_transformed: The Influence of Large Language Models on Code

Xu, Yuliang, Huang, Siming, Geng, Mingmeng, Wan, Yao, Shi, Xuanhua, Chen, Dongping

arXiv.org Artificial Intelligence Jun-16-2025

Coding remains one of the most fundamental modes of interaction between humans and machines. With the rapid advancement of Large Language Models (LLMs), code generation capabilities have begun to significantly reshape programming practices. This development prompts a central question: Have LLMs transformed code style, and how can such transformation be characterized? In this paper, we present a pioneering study that investigates the impact of LLMs on code style, with a focus on naming conventions, complexity, maintainability, and similarity. By analyzing code from over 19,000 GitHub repositories linked to arXiv papers published between 2020 and 2025, we identify measurable trends in the evolution of coding style that align with characteristics of LLM-generated code. For instance, the proportion of snake_case variable names in Python code increased from 47% in Q1 2023 to 51% in Q1 2025. Furthermore, we investigate how LLMs approach algorithmic problems by examining their reasoning processes. Given the diversity of LLMs and usage scenarios, among other factors, it is difficult or even impossible to precisely estimate the proportion of code generated or assisted by LLMs. Our experimental results provide the first large-scale empirical evidence that LLMs affect real-world programming style.
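
To make the snake_case statistic concrete, here is a rough sketch of how such a proportion could be measured for one Python source string with the standard `ast` module. The regular expression (lowercase words joined by at least one underscore) and the choice to count only assignment targets are assumptions; the paper's own definition may differ.

```python
import ast
import re

# Assumption: snake_case means lowercase words joined by at least one underscore.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")


def assigned_names(source):
    """Collect every name that is assigned to somewhere in the source."""
    tree = ast.parse(source)
    return [
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    ]


def snake_case_share(source):
    """Share of assigned names that match the snake_case pattern."""
    names = assigned_names(source)
    if not names:
        return 0.0
    return sum(1 for name in names if SNAKE_CASE.match(name)) / len(names)


sample = """
total_count = 0
maxValue = 10
user_name = "ada"
i = 1
"""
share = snake_case_share(sample)
print(share)
```

Single-word names such as `i` are counted as not snake_case here, which lowers the share; a study could reasonably exclude them instead.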

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.12014

Genre: Research Report > New Finding (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Otter: Generating Tests from Issues to Validate SWE Patches

Ahmed, Toufique, Ganhotra, Jatin, Pan, Rangeet, Shinnar, Avraham, Sinha, Saurabh, Hirzel, Martin

arXiv.org Artificial Intelligence Feb-7-2025

While there has been plenty of work on generating tests from existing code, there has been limited work on generating tests from issues. A correct test must validate the code patch that resolves the issue. In this work, we focus on the scenario where the code patch does not exist yet. This approach supports two major use-cases. First, it supports TDD (test-driven development), the discipline of "test first, write code later" that has well-documented benefits for human software engineers. Second, it also validates SWE (software engineering) agents, which generate code patches for resolving issues. This paper introduces Otter, an LLM-based solution for generating tests from issues. Otter augments LLMs with rule-based analysis to check and repair their outputs, and introduces a novel self-reflective action planning stage. Experiments show Otter outperforming state-of-the-art systems for generating tests from issues, in addition to enhancing systems that generate patches from issues. We hope that Otter helps make developers more productive at resolving issues and leads to more robust, well-tested code.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.05368

Country: North America > United States > New York (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Document Type Classification using File Names

Li, Zhijian, Larson, Stefan, Leach, Kevin

arXiv.org Artificial Intelligence Oct-1-2024

Rapid document classification is critical in several time-sensitive applications like digital forensics and large-scale media classification. Traditional approaches that rely on heavy-duty deep learning models fall short because of high inference times over vast input datasets and the computational resources needed to analyze whole documents. In this paper, we present a method that uses lightweight supervised learning models, combined with a TF-IDF feature extraction-based tokenization method, to accurately and efficiently classify documents based solely on their file names, which substantially reduces inference time. This approach can distinguish ambiguous file names from indicative ones through confidence scores and through a negative class representing ambiguous file names. Our results indicate that file name classifiers can process more than 80% of the in-scope data with 96.7% accuracy when tested on a dataset with a large portion of out-of-scope data with respect to the training dataset, while being 442.43x faster than more complex models such as DiT. Our method offers a crucial solution for efficiently processing vast datasets in critical scenarios, enabling faster, more reliable document classification.
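
The idea of classifying by file name with TF-IDF features and a confidence cutoff can be illustrated with a small standard-library sketch. The nearest-centroid classifier, the smoothed IDF formula, the 0.5 threshold, and the training file names are all stand-ins chosen for illustration; the paper's actual models are not described here, and its negative class for ambiguous names is not modeled.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(file_name):
    """Lowercase the name and split on anything that is not a letter."""
    return [tok for tok in re.split(r"[^a-z]+", file_name.lower()) if tok]


def fit_idf(token_lists):
    """Smoothed inverse document frequency for each training token."""
    n_docs = len(token_lists)
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """L2-normalized TF-IDF vector; tokens unseen in training are ignored."""
    counts = Counter(tok for tok in tokens if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {tok: w / norm for tok, w in vec.items()} if norm else {}


def train(examples):
    """Sum the TF-IDF vectors of each class into one centroid per label."""
    token_lists = [tokenize(name) for name, _ in examples]
    idf = fit_idf(token_lists)
    centroids = defaultdict(Counter)
    for tokens, (_, label) in zip(token_lists, examples):
        centroids[label].update(tfidf_vector(tokens, idf))
    return idf, centroids


def classify(file_name, idf, centroids, threshold=0.5):
    """Return (label, cosine score); low scores are reported as 'ambiguous'."""
    query = tfidf_vector(tokenize(file_name), idf)
    best_label, best_score = "ambiguous", 0.0
    for label, centroid in centroids.items():
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        score = sum(w * centroid[tok] for tok, w in query.items()) / norm if norm else 0.0
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return "ambiguous", best_score
    return best_label, best_score


# Hypothetical training set of (file name, document type) pairs.
idf, centroids = train([
    ("invoice_2021_03.pdf", "invoice"),
    ("acme-invoice-final.pdf", "invoice"),
    ("resume_john_doe.docx", "resume"),
    ("jane-resume-2020.pdf", "resume"),
])
print(classify("invoice_april.pdf", idf, centroids))
print(classify("IMG_0042.jpg", idf, centroids))
```

A name like `IMG_0042.jpg` shares no tokens with the training data, so its score is zero and it falls below the cutoff, which mirrors the abstract's point that uninformative names should be set aside rather than forced into a class.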

classifier, dataset, file name, (15 more...)

arXiv.org Artificial Intelligence

2410.01166

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (0.48)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Emotion Detection using Python - Geeky Humans

#artificialintelligence Dec-27-2021, 00:46:27 GMT

In this tutorial, we'll see how to create a Python program that detects emotion on a human face. This might be interesting if you want to do things like emotion detection using Python, or if you're training machine learning systems to read human emotions. We're going to create a program that takes an image as input and outputs a list of the human emotions it shows. To do this, we're going to use a package called Deepface. Deepface is an open-source face recognition and facial attribute analysis framework for Python.

emotion, emotion detection, geeky human, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Machine Learning

#artificialintelligence Dec-7-2021, 00:10:07 GMT

For processing the data we need some packages. Then we need a package to store our data. Alright, now that we have our packages, let us create a variable for each of our two paths, one for the "all" folder and another for the "hem" folder: Now, we need to point Python toward these folders (path variables above) and store the file names within them as a list. These two lines give you two lists, one for the "all" images and another for the "hem" images: Alright, now it is time to store the data from these images. The first line in the above code creates an empty data frame.
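
The original code for these steps is not included in this excerpt. The sketch below shows one way the described steps could look using only the standard library; the temporary directory, the placeholder `.bmp` file names, and the list of records standing in for the data frame are assumptions, and only the "all" and "hem" folder names come from the text.

```python
import tempfile
from pathlib import Path


def list_file_names(folder):
    """Return the names of the files directly inside a folder, sorted."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as root:
    # One path variable per folder, as the text describes.
    all_path = Path(root) / "all"
    hem_path = Path(root) / "hem"

    # Create placeholder image files so the example runs anywhere.
    for folder, names in ((all_path, ["a1.bmp", "a2.bmp"]), (hem_path, ["h1.bmp"])):
        folder.mkdir()
        for name in names:
            (folder / name).touch()

    # Two lists of file names, one per folder.
    all_files = list_file_names(all_path)
    hem_files = list_file_names(hem_path)

# A plain list of records stands in for the data frame mentioned in the text.
records = [{"file": name, "label": "all"} for name in all_files]
records += [{"file": name, "label": "hem"} for name in hem_files]
print(records)
```

With real data, `all_path` and `hem_path` would point at the existing image folders, and the records could be loaded into a data frame in one step.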

concat command, data frame, machine learning, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.86)

BRIEF: Everything We Know About 1970s Mainframe RPGs We Can No Longer Play

#artificialintelligence Jul-1-2021, 03:05:36 GMT

A PLATO terminal in a museum case at the University of Illinois; photo taken by the author in 2013. This entry summarizes a series of 1970s mainframe games that have been so lost we don't even have screenshots. I also asked several dozen PLATO authors, administrators, and former CRPG Addict contributors--everyone I could find--for any additional recollections about the games. I stopped only when I was confident there was nothing left to learn. If you have any new or conflicting information about any of the games below, I welcome your comments below or an e-mail to crpgaddict@gmail.com. I will update the information below with any new material discovered. However, please do not take it upon yourself to try to track down and contact any of the people listed here on my behalf; it is likely that I have already reached out and they either declined to respond or already told me all they could. Except for Don Daglow's Dungeon, all the games listed below were written in a language called TUTOR for the PLATO educational mainframe hosted by the University of Illinois Urbana-Champaign. Many of the games written on this system have been preserved and are playable today at Cyber1.

daglow, dungeon, recollection, (16 more...)

#artificialintelligence

Country:

North America > United States > Illinois > Champaign County > Urbana (0.24)
North America > United States > Iowa (0.05)
North America > United States > California (0.04)
North America > United States > Indiana (0.04)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education > Educational Setting (0.70)

Technology:

Information Technology > Artificial Intelligence (0.68)
Information Technology > Communications (0.47)

6 Python Projects You Can Finish in a Weekend

#artificialintelligence Jun-29-2021, 03:00:09 GMT

Learning Python can be difficult. You might spend a lot of time watching videos and reading books; however, if you can't put all the concepts learned into practice, that time will be wasted. This is why you should get your hands dirty with Python projects. A project will help you bring together everything you've learned, stay motivated, build a portfolio and come up with ways of approaching problems and solving them with code. In this article, I listed some projects that helped me level up my Python code and hopefully will help you too.

python, python code, recommendation system, (15 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.32)

Industry: